Dataset statistics
| Number of variables | 29 |
|---|---|
| Number of observations | 742 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 275 |
| Duplicate rows (%) | 37.1% |
| Total size in memory | 168.2 KiB |
| Average record size in memory | 232.2 B |
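The duplicate figures above can be reproduced from plain row counts. The following is a minimal sketch, assuming a duplicate is any row that exactly repeats an earlier row (the report does not state its definition); `duplicate_stats` and the toy `rows` are hypothetical and not part of the report.

```python
from collections import Counter


def duplicate_stats(rows):
    """Return (n_duplicates, share), counting every repeat of an earlier row once.

    Assumption: this mirrors how the report counts duplicate rows.
    """
    counts = Counter(rows)
    n_duplicates = sum(c - 1 for c in counts.values())
    share = n_duplicates / len(rows) if rows else 0.0
    return n_duplicates, share


# Hypothetical toy rows (tuples, so they are hashable); not taken from the dataset.
rows = [("a", 1), ("b", 2), ("a", 1), ("a", 1), ("c", 3)]
n_dup, share = duplicate_stats(rows)
print(n_dup, f"{share:.1%}")

# The percentage in the table follows from the reported counts.
print(f"{275 / 742:.1%}")
```

With the reported counts, 275 duplicates out of 742 observations gives the 37.1% shown in the table.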
Variable types
| BOOL | 11 |
|---|---|
| CAT | 11 |
| NUM | 7 |
Reproduction
| Analysis started | 2020-06-21 10:58:09.382623 |
|---|---|
| Analysis finished | 2020-06-21 10:58:36.979878 |
| Duration | 27.6 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| Dataset has 275 (37.1%) duplicate rows | Duplicates |
| Location has a high cardinality: 200 distinct values | High cardinality |
| Headquarters has a high cardinality: 198 distinct values | High cardinality |
| Industry has a high cardinality: 60 distinct values | High cardinality |
| name has a high cardinality: 343 distinct values | High cardinality |
| max_salary is highly correlated with min_salary and 1 other field | High correlation |
| min_salary is highly correlated with max_salary and 1 other field | High correlation |
| avg_salary is highly correlated with min_salary and 1 other field | High correlation |
| Sector is highly correlated with Industry | High correlation |
| Industry is highly correlated with Sector | High correlation |
| comp has 460 (62.0%) zeros | Zeros |
Rating
Real number (ℝ)
| Distinct count | 31 |
|---|---|
| Unique (%) | 4.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.6188679245283017 |
|---|---|
| Minimum | -1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 2.6 |
| Q1 | 3.3 |
| median | 3.7 |
| Q3 | 4 |
| 95-th percentile | 4.7 |
| Maximum | 5 |
| Range | 6 |
| Interquartile range (IQR) | 0.7 |
Descriptive statistics
| Standard deviation | 0.8012101585 |
|---|---|
| Coefficient of variation (CV) | 0.2213980104 |
| Kurtosis | 14.30412724 |
| Mean | 3.618867925 |
| Median Absolute Deviation (MAD) | 0.35 |
| Skewness | -2.814019554 |
| Sum | 2685.2 |
| Variance | 0.641937718 |
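Several of the Rating statistics above are derived from one another, which gives a quick consistency check. The sketch below only uses values printed in this report; the variable names are illustrative.

```python
import math

# Values copied from the Rating tables above.
n = 742            # number of observations
total = 2685.2     # Sum
std = 0.8012101585 # Standard deviation
q1, q3 = 3.3, 4.0  # Q1 and Q3
minimum, maximum = -1.0, 5.0

mean = total / n                 # Mean = Sum / n
variance = std ** 2              # Variance = standard deviation squared
cv = std / mean                  # Coefficient of variation = std / mean
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
value_range = maximum - minimum  # Range = Maximum - Minimum

print(round(mean, 9), round(variance, 9), round(cv, 9), round(iqr, 2), value_range)

# Compare against the figures printed in the report.
assert math.isclose(mean, 3.618867925, abs_tol=1e-8)
assert math.isclose(variance, 0.641937718, abs_tol=1e-8)
assert math.isclose(cv, 0.2213980104, abs_tol=1e-8)
```

The same relations should hold for the other numeric variables in this report.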
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3.9 | 63 | 8.5% | |
| 3.8 | 61 | 8.2% | |
| 3.7 | 61 | 8.2% | |
| 3.5 | 49 | 6.6% | |
| 4 | 47 | 6.3% | |
| 3.6 | 46 | 6.2% | |
| 3.4 | 44 | 5.9% | |
| 3.3 | 39 | 5.3% | |
| 3.2 | 35 | 4.7% | |
| 4.4 | 33 | 4.4% | |
| Other values (21) | 264 | 35.6% |
| Value | Count | Frequency (%) | |
| -1 | 11 | 1.5% | |
| 1.9 | 3 | 0.4% | |
| 2.1 | 5 | 0.7% | |
| 2.2 | 2 | 0.3% | |
| 2.3 | 2 | 0.3% |
| Value | Count | Frequency (%) | |
| 5 | 5 | 0.7% | |
| 4.8 | 9 | 1.2% | |
| 4.7 | 31 | 4.2% | |
| 4.6 | 10 | 1.3% | |
| 4.5 | 7 | 0.9% |
Location
Categorical
| Distinct count | 200 |
|---|---|
| Unique (%) | 27.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| New York, NY | 55 | 7.4% | |
| San Francisco, CA | 49 | 6.6% | |
| Cambridge, MA | 47 | 6.3% | |
| Chicago, IL | 32 | 4.3% | |
| Boston, MA | 23 | 3.1% | |
| San Jose, CA | 13 | 1.8% | |
| Pittsburgh, PA | 12 | 1.6% | |
| Rockville, MD | 11 | 1.5% | |
| Washington, DC | 11 | 1.5% | |
| Winston-Salem, NC | 10 | 1.3% | |
| Other values (190) | 479 | 64.6% |
Length
| Max length | 33 |
|---|---|
| Median length | 13 |
| Mean length | 13.1509434 |
| Min length | 8 |
Headquarters
Categorical
| Distinct count | 198 |
|---|---|
| Unique (%) | 26.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| New York, NY | 52 | 7.0% | |
| San Francisco, CA | 42 | 5.7% | |
| Chicago, IL | 30 | 4.0% | |
| Cambridge, MA | 20 | 2.7% | |
| Winston-Salem, NC | 14 | 1.9% | |
| Boston, MA | 14 | 1.9% | |
| OSAKA, Japan | 14 | 1.9% | |
| Springfield, MA | 14 | 1.9% | |
| Reston, VA | 12 | 1.6% | |
| Richland, WA | 12 | 1.6% | |
| Other values (188) | 518 | 69.8% |
Length
| Max length | 26 |
|---|---|
| Median length | 13 |
| Mean length | 13.606469 |
| Min length | 2 |
Size
Categorical
| Distinct count | 9 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 1001 to 5000 employees | 150 | 20.2% | |
| 501 to 1000 employees | 134 | 18.1% | |
| 10000+ employees | 130 | 17.5% | |
| 201 to 500 employees | 117 | 15.8% | |
| 51 to 200 employees | 94 | 12.7% | |
| 5001 to 10000 employees | 76 | 10.2% | |
| 1 to 50 employees | 31 | 4.2% | |
| Unknown | 9 | 1.2% | |
| -1 | 1 | 0.1% |
Length
| Max length | 23 |
|---|---|
| Median length | 20 |
| Mean length | 19.7574124 |
| Min length | 2 |
Type of ownership
Categorical
| Distinct count | 11 |
|---|---|
| Unique (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| Company - Private | 410 | 55.3% | |
| Company - Public | 193 | 26.0% | |
| Nonprofit Organization | 55 | 7.4% | |
| Subsidiary or Business Segment | 34 | 4.6% | |
| Government | 15 | 2.0% | |
| Hospital | 15 | 2.0% | |
| College / University | 13 | 1.8% | |
| Other Organization | 3 | 0.4% | |
| School / School District | 2 | 0.3% | |
| -1 | 1 | 0.1% |
Length
| Max length | 30 |
|---|---|
| Median length | 17 |
| Mean length | 17.4245283 |
| Min length | 2 |
Industry
Categorical
| Distinct count | 60 |
|---|---|
| Unique (%) | 8.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| Biotech & Pharmaceuticals | 112 | 15.1% | |
| Insurance Carriers | 63 | 8.5% | |
| Computer Hardware & Software | 59 | 8.0% | |
| IT Services | 50 | 6.7% | |
| Health Care Services & Hospitals | 49 | 6.6% | |
| Enterprise Software & Network Solutions | 42 | 5.7% | |
| Internet | 29 | 3.9% | |
| Consulting | 29 | 3.9% | |
| Advertising & Marketing | 25 | 3.4% | |
| Aerospace & Defense | 25 | 3.4% | |
| Other values (50) | 259 | 34.9% |
Length
| Max length | 40 |
|---|---|
| Median length | 23 |
| Mean length | 21.9083558 |
| Min length | 2 |
Sector
Categorical
| Distinct count | 25 |
|---|---|
| Unique (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| Information Technology | 180 | 24.3% | |
| Biotech & Pharmaceuticals | 112 | 15.1% | |
| Business Services | 97 | 13.1% | |
| Insurance | 69 | 9.3% | |
| Health Care | 49 | 6.6% | |
| Finance | 42 | 5.7% | |
| Manufacturing | 34 | 4.6% | |
| Aerospace & Defense | 25 | 3.4% | |
| Education | 23 | 3.1% | |
| Retail | 15 | 2.0% | |
| Other values (15) | 96 | 12.9% |
Length
| Max length | 34 |
|---|---|
| Median length | 17 |
| Mean length | 17.02695418 |
| Min length | 2 |
Revenue
Categorical
| Distinct count | 14 |
|---|---|
| Unique (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| Unknown / Non-Applicable | 203 | 27.4% | |
| $10+ billion (USD) | 124 | 16.7% | |
| $100 to $500 million (USD) | 91 | 12.3% | |
| $1 to $2 billion (USD) | 60 | 8.1% | |
| $500 million to $1 billion (USD) | 57 | 7.7% | |
| $50 to $100 million (USD) | 46 | 6.2% | |
| $25 to $50 million (USD) | 40 | 5.4% | |
| $2 to $5 billion (USD) | 39 | 5.3% | |
| $10 to $25 million (USD) | 32 | 4.3% | |
| $5 to $10 billion (USD) | 19 | 2.6% | |
| Other values (4) | 31 | 4.2% |
Length
| Max length | 32 |
|---|---|
| Median length | 24 |
| Mean length | 23.56199461 |
| Min length | 2 |
min_salary
Real number (ℝ≥0)
| Distinct count | 114 |
|---|---|
| Unique (%) | 15.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 74.71967654986523 |
|---|---|
| Minimum | 15 |
| Maximum | 202 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 35.05 |
| Q1 | 52 |
| median | 69.5 |
| Q3 | 91 |
| 95-th percentile | 127 |
| Maximum | 202 |
| Range | 187 |
| Interquartile range (IQR) | 39 |
Descriptive statistics
| Standard deviation | 30.98059322 |
|---|---|
| Coefficient of variation (CV) | 0.4146242951 |
| Kurtosis | 1.954967771 |
| Mean | 74.71967655 |
| Median Absolute Deviation (MAD) | 19.5 |
| Skewness | 1.109233676 |
| Sum | 55442 |
| Variance | 959.7971562 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 42 | 22 | 3.0% | |
| 65 | 20 | 2.7% | |
| 61 | 18 | 2.4% | |
| 80 | 18 | 2.4% | |
| 81 | 17 | 2.3% | |
| 74 | 16 | 2.2% | |
| 63 | 16 | 2.2% | |
| 60 | 15 | 2.0% | |
| 54 | 15 | 2.0% | |
| 56 | 15 | 2.0% | |
| Other values (104) | 570 | 76.8% |
| Value | Count | Frequency (%) | |
| 15 | 1 | 0.1% | |
| 20 | 3 | 0.4% | |
| 26 | 1 | 0.1% | |
| 27 | 2 | 0.3% | |
| 29 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 202 | 3 | 0.4% | |
| 200 | 3 | 0.4% | |
| 190 | 3 | 0.4% | |
| 176 | 1 | 0.1% | |
| 171 | 1 | 0.1% |
max_salary
Real number (ℝ≥0)
| Distinct count | 160 |
|---|---|
| Unique (%) | 21.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 128.14959568733153 |
|---|---|
| Minimum | 16 |
| Maximum | 306 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | 16 |
|---|---|
| 5-th percentile | 62 |
| Q1 | 96 |
| median | 124 |
| Q3 | 155 |
| 95-th percentile | 208 |
| Maximum | 306 |
| Range | 290 |
| Interquartile range (IQR) | 59 |
Descriptive statistics
| Standard deviation | 45.22032426 |
|---|---|
| Coefficient of variation (CV) | 0.3528713767 |
| Kurtosis | 0.6052151151 |
| Mean | 128.1495957 |
| Median Absolute Deviation (MAD) | 29 |
| Skewness | 0.6244715389 |
| Sum | 95087 |
| Variance | 2044.877726 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 140 | 16 | 2.2% | |
| 119 | 15 | 2.0% | |
| 124 | 15 | 2.0% | |
| 110 | 15 | 2.0% | |
| 127 | 13 | 1.8% | |
| 113 | 13 | 1.8% | |
| 68 | 12 | 1.6% | |
| 101 | 12 | 1.6% | |
| 86 | 12 | 1.6% | |
| 173 | 12 | 1.6% | |
| Other values (150) | 607 | 81.8% |
| Value | Count | Frequency (%) | |
| 16 | 1 | 0.1% | |
| 34 | 2 | 0.3% | |
| 39 | 1 | 0.1% | |
| 48 | 4 | 0.5% | |
| 50 | 6 | 0.8% |
| Value | Count | Frequency (%) | |
| 306 | 3 | 0.4% | |
| 289 | 1 | 0.1% | |
| 275 | 1 | 0.1% | |
| 272 | 1 | 0.1% | |
| 250 | 2 | 0.3% |
avg_salary
Real number (ℝ≥0)
| Distinct count | 219 |
|---|---|
| Unique (%) | 29.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 101.43463611859838 |
|---|---|
| Minimum | 15.5 |
| Maximum | 254.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | 15.5 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 73.5 |
| median | 97.5 |
| Q3 | 122.5 |
| 95-th percentile | 167.5 |
| Maximum | 254 |
| Range | 238.5 |
| Interquartile range (IQR) | 49 |
Descriptive statistics
| Standard deviation | 37.54612242 |
|---|---|
| Coefficient of variation (CV) | 0.3701509056 |
| Kurtosis | 0.9764064957 |
| Mean | 101.4346361 |
| Median Absolute Deviation (MAD) | 24.5 |
| Skewness | 0.7895822952 |
| Sum | 75264.5 |
| Variance | 1409.711309 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 87.5 | 12 | 1.6% | |
| 81 | 11 | 1.5% | |
| 140 | 11 | 1.5% | |
| 84.5 | 10 | 1.3% | |
| 107.5 | 10 | 1.3% | |
| 107 | 10 | 1.3% | |
| 85 | 10 | 1.3% | |
| 87 | 9 | 1.2% | |
| 120 | 9 | 1.2% | |
| 154.5 | 8 | 1.1% | |
| Other values (209) | 642 | 86.5% |
| Value | Count | Frequency (%) | |
| 15.5 | 1 | 0.1% | |
| 27 | 2 | 0.3% | |
| 29.5 | 1 | 0.1% | |
| 37.5 | 2 | 0.3% | |
| 39.5 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 254 | 3 | 0.4% | |
| 237.5 | 1 | 0.1% | |
| 232.5 | 1 | 0.1% | |
| 225 | 2 | 0.3% | |
| 221.5 | 1 | 0.1% |
hourly
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 718 | 96.8% | |
| 1 | 24 | 3.2% |
employer_provided
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 725 | 97.7% | |
| 1 | 17 | 2.3% |
job_state
Categorical
| Distinct count | 37 |
|---|---|
| Unique (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| CA | 152 | 20.5% | |
| MA | 103 | 13.9% | |
| NY | 72 | 9.7% | |
| VA | 41 | 5.5% | |
| IL | 40 | 5.4% | |
| MD | 35 | 4.7% | |
| PA | 33 | 4.4% | |
| TX | 28 | 3.8% | |
| WA | 21 | 2.8% | |
| NC | 21 | 2.8% | |
| Other values (27) | 196 | 26.4% |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
same_state
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 1 | 414 | 55.8% | |
| 0 | 328 | 44.2% |
age
Real number (ℝ)
| Distinct count | 102 |
|---|---|
| Unique (%) | 13.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 46.591644204851754 |
|---|---|
| Minimum | -1 |
| Maximum | 276 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | 11 |
| median | 24 |
| Q3 | 59 |
| 95-th percentile | 169 |
| Maximum | 276 |
| Range | 277 |
| Interquartile range (IQR) | 48 |
Descriptive statistics
| Standard deviation | 53.77881512 |
|---|---|
| Coefficient of variation (CV) | 1.154258795 |
| Kurtosis | 2.791116976 |
| Mean | 46.5916442 |
| Median Absolute Deviation (MAD) | 17 |
| Skewness | 1.78661821 |
| Sum | 34571 |
| Variance | 2892.160956 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| -1 | 50 | 6.7% | |
| 10 | 32 | 4.3% | |
| 12 | 31 | 4.2% | |
| 24 | 27 | 3.6% | |
| 14 | 24 | 3.2% | |
| 8 | 21 | 2.8% | |
| 9 | 19 | 2.6% | |
| 62 | 18 | 2.4% | |
| 18 | 18 | 2.4% | |
| 36 | 18 | 2.4% | |
| Other values (92) | 484 | 65.2% |
| Value | Count | Frequency (%) | |
| -1 | 50 | 6.7% | |
| 1 | 2 | 0.3% | |
| 3 | 12 | 1.6% | |
| 4 | 5 | 0.7% | |
| 5 | 16 | 2.2% |
| Value | Count | Frequency (%) | |
| 276 | 1 | 0.1% | |
| 239 | 14 | 1.9% | |
| 208 | 1 | 0.1% | |
| 190 | 4 | 0.5% | |
| 174 | 2 | 0.3% |
python
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 1 | 392 | 52.8% | |
| 0 | 350 | 47.2% |
r
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 740 | 99.7% | |
| 1 | 2 | 0.3% |
sas
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 655 | 88.3% | |
| 1 | 87 | 11.7% |
spark
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 575 | 77.5% | |
| 1 | 167 | 22.5% |
aws
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 566 | 76.3% | |
| 1 | 176 | 23.7% |
sql
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 1 | 380 | 51.2% | |
| 0 | 362 | 48.8% |
excel
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 1 | 388 | 52.3% | |
| 0 | 354 | 47.7% |
matlab
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| 0 | 687 | 92.6% | |
| 1 | 55 | 7.4% |
desc_len
Real number (ℝ≥0)
| Distinct count | 439 |
|---|---|
| Unique (%) | 59.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3910.1725067385446 |
|---|---|
| Minimum | 407 |
| Maximum | 10146 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | 407 |
|---|---|
| 5-th percentile | 1808 |
| Q1 | 2834 |
| median | 3781.5 |
| Q3 | 4772 |
| 95-th percentile | 6683 |
| Maximum | 10146 |
| Range | 9739 |
| Interquartile range (IQR) | 1938 |
Descriptive statistics
| Standard deviation | 1533.827777 |
|---|---|
| Coefficient of variation (CV) | 0.3922660124 |
| Kurtosis | 1.061532802 |
| Mean | 3910.172507 |
| Median Absolute Deviation (MAD) | 982.5 |
| Skewness | 0.7693630273 |
| Sum | 2901348 |
| Variance | 2352627.649 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 4538 | 5 | 0.7% | |
| 3334 | 4 | 0.5% | |
| 2855 | 4 | 0.5% | |
| 5421 | 4 | 0.5% | |
| 2312 | 4 | 0.5% | |
| 1967 | 4 | 0.5% | |
| 4644 | 4 | 0.5% | |
| 2455 | 4 | 0.5% | |
| 3901 | 4 | 0.5% | |
| 5215 | 4 | 0.5% | |
| Other values (429) | 701 | 94.5% |
| Value | Count | Frequency (%) | |
| 407 | 1 | 0.1% | |
| 695 | 1 | 0.1% | |
| 714 | 1 | 0.1% | |
| 745 | 1 | 0.1% | |
| 889 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 10146 | 2 | 0.3% | |
| 9347 | 1 | 0.1% | |
| 9165 | 2 | 0.3% | |
| 8882 | 2 | 0.3% | |
| 8876 | 2 | 0.3% |
level
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| senior | 442 | 59.6% | |
| other | 154 | 20.8% | |
| higher | 137 | 18.5% | |
| junior | 9 | 1.2% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.79245283 |
| Min length | 5 |
title
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| DS | 308 | 41.5% | |
| other | 183 | 24.7% | |
| A | 105 | 14.2% | |
| DE | 81 | 10.9% | |
| mle | 65 | 8.8% |
Length
| Max length | 5 |
|---|---|
| Median length | 2 |
| Mean length | 2.685983827 |
| Min length | 1 |
comp
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0539083557951483 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 460 |
| Zeros (%) | 62.0% |
| Memory size | 5.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3 |
| 95-th percentile | 3 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.384239252 |
|---|---|
| Coefficient of variation (CV) | 1.313434175 |
| Kurtosis | -1.54997464 |
| Mean | 1.053908356 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.6133645812 |
| Sum | 782 |
| Variance | 1.916118307 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 460 | 62.0% | |
| 3 | 228 | 30.7% | |
| 2 | 41 | 5.5% | |
| 1 | 12 | 1.6% | |
| 4 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 460 | 62.0% | |
| 1 | 12 | 1.6% | |
| 2 | 41 | 5.5% | |
| 3 | 228 | 30.7% | |
| 4 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 4 | 1 | 0.1% | |
| 3 | 228 | 30.7% | |
| 2 | 41 | 5.5% | |
| 1 | 12 | 1.6% | |
| 0 | 460 | 62.0% |
name
Categorical
| Distinct count | 343 |
|---|---|
| Unique (%) | 46.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.8 KiB |
| Value | Count | Frequency (%) | |
| MassMutual | 14 | 1.9% | |
| Takeda Pharmaceuticals | 14 | 1.9% | |
| Reynolds American | 14 | 1.9% | |
| Software Engineering Institute | 11 | 1.5% | |
| PNNL | 10 | 1.3% | |
| Liberty Mutual Insurance | 10 | 1.3% | |
| AstraZeneca | 9 | 1.2% | |
| MITRE | 8 | 1.1% | |
| Advanced BioScience Laboratories | 7 | 0.9% | |
| Numeric, LLC | 7 | 0.9% | |
| Other values (333) | 638 | 86.0% |
Length
| Max length | 51 |
|---|---|
| Median length | 13 |
| Mean length | 15.23989218 |
| Min length | 2 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
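The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report itself runs; `pearson_r` is a hypothetical helper, and the sample values are the `min_salary` and `max_salary` columns of the ten rows shown under "First rows".

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of both standard deviations cancel,
    # so plain sums of (squared) deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (dev_x * dev_y)


# min_salary and max_salary from the ten rows listed under "First rows".
min_salary = [53, 63, 80, 56, 86, 71, 54, 86, 38, 120]
max_salary = [91, 112, 90, 97, 143, 119, 93, 142, 84, 160]

r = pearson_r(min_salary, max_salary)
print(round(r, 3))

# Shifting and (positively) rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 10 for x in min_salary],
                       [0.5 * y - 3 for y in max_salary])
print(math.isclose(r, r_rescaled))
```

Ten rows are far too few to reproduce the report's correlation; the sample only demonstrates the arithmetic and the invariance property.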
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
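In other words, ρ is Pearson's r applied to ranks. The sketch below illustrates this in plain Python, assuming tied values receive the average of the ranks they span (a common convention that the text above does not specify); `average_ranks` and `spearman_rho` are hypothetical helper names.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (dev_x * dev_y)


# A monotonic but nonlinear relationship: y = x ** 3.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
print(average_ranks([10, 20, 20, 30]))
```

Because the ranks of `xs` and `ys` are identical, ρ is 1 here even though the relationship is not linear.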
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
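The pair-counting definition above can be written out directly. The sketch below implements exactly that formula (often called τ-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries frequently use a tie-adjusted variant instead, so results on data with ties may differ from the report's. `kendall_tau_a` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted as neither
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This quadratic-time loop is fine for illustration; on larger data a library routine would be the practical choice.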
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Rating | Location | Headquarters | Size | Type of ownership | Industry | Sector | Revenue | min_salary | max_salary | avg_salary | hourly | employer_provided | job_state | same_state | age | python | r | sas | spark | aws | sql | excel | matlab | desc_len | level | title | comp | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.8 | Albuquerque, NM | Goleta, CA | 501 to 1000 employees | Company - Private | Aerospace & Defense | Aerospace & Defense | $50 to $100 million (USD) | 53 | 91 | 72.0 | 0 | 0 | NM | 0 | 47 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 2555 | other | DS | 0 | Tecolote Research |
| 1 | 3.4 | Linthicum, MD | Baltimore, MD | 10000+ employees | Other Organization | Health Care Services & Hospitals | Health Care | $2 to $5 billion (USD) | 63 | 112 | 87.5 | 0 | 0 | MD | 0 | 36 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4828 | higher | DS | 0 | University of Maryland Medical System |
| 2 | 4.8 | Clearwater, FL | Clearwater, FL | 501 to 1000 employees | Company - Private | Security Services | Business Services | $100 to $500 million (USD) | 80 | 90 | 85.0 | 0 | 0 | FL | 1 | 10 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 3495 | higher | DS | 0 | KnowBe4 |
| 3 | 3.8 | Richland, WA | Richland, WA | 1001 to 5000 employees | Government | Energy | Oil, Gas, Energy & Utilities | $500 million to $1 billion (USD) | 56 | 97 | 76.5 | 0 | 0 | WA | 1 | 55 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3926 | higher | mle | 3 | PNNL |
| 4 | 2.9 | New York, NY | New York, NY | 51 to 200 employees | Company - Private | Advertising & Marketing | Business Services | Unknown / Non-Applicable | 86 | 143 | 114.5 | 0 | 0 | NY | 1 | 22 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 2748 | senior | DS | 3 | Affinity Solutions |
| 5 | 3.4 | Dallas, TX | Dallas, TX | 201 to 500 employees | Company - Public | Real Estate | Real Estate | $1 to $2 billion (USD) | 71 | 119 | 95.0 | 0 | 0 | TX | 1 | 20 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 3783 | other | DS | 3 | CyrusOne |
| 6 | 4.1 | Baltimore, MD | Baltimore, MD | 501 to 1000 employees | Company - Private | Banks & Credit Unions | Finance | Unknown / Non-Applicable | 54 | 93 | 73.5 | 0 | 0 | MD | 1 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1808 | senior | DS | 0 | ClearOne Advantage |
| 7 | 3.8 | San Jose, CA | Seattle, WA | 201 to 500 employees | Company - Private | Consulting | Business Services | $25 to $50 million (USD) | 86 | 142 | 114.0 | 0 | 0 | CA | 0 | 15 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 3847 | senior | DS | 0 | Logic20/20 |
| 8 | 3.3 | Rochester, NY | Rochester, NY | 10000+ employees | Hospital | Health Care Services & Hospitals | Health Care | $500 million to $1 billion (USD) | 38 | 84 | 61.0 | 0 | 0 | NY | 1 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1561 | higher | other | 0 | Rochester Regional Health |
| 9 | 4.6 | New York, NY | New York, NY | 51 to 200 employees | Company - Private | Internet | Information Technology | $100 to $500 million (USD) | 120 | 160 | 140.0 | 0 | 0 | NY | 1 | 11 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 4609 | senior | mle | 2 | <intent> |
Last rows
| | Rating | Location | Headquarters | Size | Type of ownership | Industry | Sector | Revenue | min_salary | max_salary | avg_salary | hourly | employer_provided | job_state | same_state | age | python | r | sas | spark | aws | sql | excel | matlab | desc_len | level | title | comp | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 732 | 4.1 | Palo Alto, CA | Palo Alto, CA | 1 to 50 employees | Company - Private | K-12 Education | Education | Unknown / Non-Applicable | 80 | 142 | 111.0 | 0 | 0 | CA | 1 | 13 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 3526 | senior | mle | 0 | CK-12 Foundation |
| 733 | 3.9 | San Francisco, CA | San Francisco, CA | 51 to 200 employees | Company - Public | Computer Hardware & Software | Information Technology | Unknown / Non-Applicable | 99 | 178 | 138.5 | 0 | 0 | CA | 1 | 12 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 5777 | senior | A | 0 | Life360 |
| 734 | 3.6 | Boston, MA | Springfield, MA | 5001 to 10000 employees | Company - Private | Insurance Carriers | Insurance | $10+ billion (USD) | 37 | 100 | 68.5 | 0 | 0 | MA | 0 | 169 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 5071 | senior | DS | 0 | MassMutual |
| 735 | 3.9 | San Francisco, CA | San Francisco, CA | 201 to 500 employees | Company - Private | Internet | Information Technology | $100 to $500 million (USD) | 62 | 113 | 87.5 | 0 | 0 | CA | 1 | 9 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 3849 | higher | DE | 2 | Fivestars |
| 736 | 3.6 | Plymouth Meeting, PA | Durham, NC | 10000+ employees | Company - Public | Biotech & Pharmaceuticals | Biotech & Pharmaceuticals | $2 to $5 billion (USD) | 86 | 137 | 111.5 | 0 | 0 | PA | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5064 | senior | DS | 3 | IQVIA |
| 737 | 3.9 | Cambridge, MA | Brentford, United Kingdom | 10000+ employees | Company - Public | Biotech & Pharmaceuticals | Biotech & Pharmaceuticals | $10+ billion (USD) | 58 | 111 | 84.5 | 0 | 0 | MA | 0 | 190 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 6219 | senior | other | 3 | GSK |
| 738 | 4.4 | Nashville, TN | San Francisco, CA | 1001 to 5000 employees | Company - Public | Internet | Information Technology | $100 to $500 million (USD) | 72 | 133 | 102.5 | 0 | 0 | TN | 0 | 14 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 6167 | senior | DE | 3 | Eventbrite |
| 739 | 2.6 | Pittsburgh, PA | Pittsburgh, PA | 501 to 1000 employees | College / University | Colleges & Universities | Education | Unknown / Non-Applicable | 56 | 91 | 73.5 | 0 | 0 | PA | 1 | 36 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 3107 | higher | mle | 0 | Software Engineering Institute |
| 740 | 3.2 | Allentown, PA | Chadds Ford, PA | 1 to 50 employees | Company - Private | Staffing & Outsourcing | Business Services | $5 to $10 million (USD) | 95 | 160 | 127.5 | 0 | 0 | PA | 0 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1678 | senior | DS | 0 | Numeric, LLC |
| 741 | 3.6 | Beavercreek, OH | Arlington, VA | 501 to 1000 employees | Nonprofit Organization | Federal Agencies | Government | $50 to $100 million (USD) | 61 | 126 | 93.5 | 0 | 0 | OH | 0 | 53 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3710 | higher | mle | 0 | Riverside Research Institute |
Most frequent
| | Rating | Location | Headquarters | Size | Type of ownership | Industry | Sector | Revenue | min_salary | max_salary | avg_salary | hourly | employer_provided | job_state | same_state | age | python | r | sas | spark | aws | sql | excel | matlab | desc_len | level | title | comp | name | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 2.4 | Hoopeston, IL | Flower Mound, TX | 501 to 1000 employees | Company - Private | Food & Beverage Manufacturing | Manufacturing | $100 to $500 million (USD) | 39 | 66 | 52.5 | 0 | 0 | IL | 0 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2455 | other | other | 0 | Teasdale Latin Foods | 4 |
| 10 | 2.6 | Pittsburgh, PA | Pittsburgh, PA | 501 to 1000 employees | College / University | Colleges & Universities | Education | Unknown / Non-Applicable | 81 | 167 | 124.0 | 0 | 0 | PA | 1 | 36 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5421 | senior | mle | 0 | Software Engineering Institute | 4 |
| 14 | 2.7 | Rockville, MD | Rockville, MD | 201 to 500 employees | Company - Private | Biotech & Pharmaceuticals | Biotech & Pharmaceuticals | $25 to $50 million (USD) | 49 | 113 | 81.0 | 0 | 0 | MD | 1 | 59 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 3809 | senior | other | 0 | Advanced BioScience Laboratories | 4 |
| 24 | 3.0 | Knoxville, TN | Knoxville, TN | 10000+ employees | Company - Private | Gas Stations | Retail | $10+ billion (USD) | 69 | 127 | 98.0 | 0 | 0 | TN | 1 | 62 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2177 | other | other | 3 | Pilot Flying J Travel Centers LLC | 4 |
| 52 | 3.3 | Milwaukee, WI | Milwaukee, WI | 501 to 1000 employees | Company - Private | Food & Beverage Manufacturing | Manufacturing | Unknown / Non-Applicable | 40 | 68 | 54.0 | 0 | 0 | WI | 1 | 56 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2855 | higher | other | 0 | Palermo's Pizza | 4 |
| 82 | 3.5 | Scotts Valley, CA | Scotts Valley, CA | 501 to 1000 employees | Nonprofit Organization | Health Care Services & Hospitals | Health Care | $500 million to $1 billion (USD) | 42 | 86 | 64.0 | 0 | 0 | CA | 1 | 24 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 3901 | higher | other | 0 | Central California Alliance for Health | 4 |
| 91 | 3.6 | Highland, CA | Highland, CA | 1001 to 5000 employees | Company - Private | Gambling | Arts, Entertainment & Recreation | $100 to $500 million (USD) | 35 | 62 | 48.5 | 0 | 0 | CA | 1 | 34 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 4644 | higher | A | 0 | San Manuel Casino | 4 |
| 93 | 3.6 | Millville, DE | Lewes, DE | 1001 to 5000 employees | Nonprofit Organization | Health Care Services & Hospitals | Health Care | $100 to $500 million (USD) | 42 | 68 | 55.0 | 1 | 0 | DE | 0 | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2840 | other | other | 0 | Beebe Healthcare | 4 |
| 165 | 4.0 | Burleson, TX | Arlington, TX | 1001 to 5000 employees | Hospital | Health Care Services & Hospitals | Health Care | $50 to $100 million (USD) | 36 | 50 | 43.0 | 1 | 0 | TX | 0 | 43 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 5215 | other | other | 0 | Texas Health Huguley Hospital | 4 |
| 0 | -1.0 | Cambridge, MA | San Mateo, CA | Unknown | Company - Private | -1 | -1 | Unknown / Non-Applicable | 100 | 140 | 120.0 | 0 | 1 | MA | 0 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 3334 | higher | other | 0 | Kronos Bio | 3 |